Bayesian Subtree Alignment Model based on Dependency Trees
Abstract
Word-sequential alignment models work well for similar language pairs but are inadequate for distant ones: without structural information about the sentences, it is difficult to align words or phrases of distant languages with high accuracy. In this paper, we propose a Bayesian subtree alignment model that incorporates dependency relations between subtrees in the dependency trees on both sides. The dependency relation model is a kind of tree-based reordering model and can handle non-local reorderings, which sequential word-based models often cannot handle properly. The model can also handle multilevel structures, making it possible to find many-to-many correspondences automatically without heuristic rules. The size of the structures is controlled by nonparametric Bayesian priors. Experimental results show that, for English-Japanese, our model improves alignment error rate by 3.5 points over the word-sequential alignment model, verifying that dependency information is effective for structurally different language pairs.
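The abstract reports results as alignment error rate (AER) but does not define the metric. The sketch below assumes the commonly used definition, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the sure gold links, and P the possible gold links (taken here to include S). The function name and the toy link sets are illustrative only and are not taken from the paper.

```python
def alignment_error_rate(predicted, sure, possible):
    """Compute AER under the assumed definition:
    AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each argument is an iterable of (source_index, target_index) links.
    """
    predicted = set(predicted)
    sure = set(sure)
    # Assumption: every sure link also counts as a possible link.
    possible = set(possible) | sure
    denominator = len(predicted) + len(sure)
    if denominator == 0:
        return 0.0  # nothing predicted and nothing to find
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / denominator


# Toy example (hypothetical data): three predicted links, one of them wrong.
sure_links = {(0, 0), (1, 2)}
possible_links = {(2, 1)}
predicted_links = {(0, 0), (2, 1), (1, 1)}
aer = alignment_error_rate(predicted_links, sure_links, possible_links)
```

Lower values are better under this definition, so a reported gain of 3.5 points corresponds to a reduction of 0.035 on this scale if the points are percentage points.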
Similar resources
Bitext Dependency Parsing with Bilingual Subtree Constraints
This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). In our method, a target-side tree fragment that corresponds to a source-side tree fragment is identified via word alignment and mapping rules that are automatically learned. Then it is verified by checking the subtree list that is collected from large scal...
An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition
Several approaches have been described for the automatic unsupervised acquisition of patterns for information extraction. Each approach is based on a particular model for the patterns to be acquired, such as a predicate-argument structure or a dependency chain. The effect of these alternative models has not been previously studied. In this paper, we compare the prior models and introduce a new ...
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...
Discovering Constrained Substructures in Bayesian Trees Using the E.M. Algorithm
In this paper, we present an Expectation-Maximization learning algorithm (E.M.) for estimating parameters of partially-constrained Bayesian trees. The Bayesian trees considered here consist of an unconstrained subtree and a set of constrained subtrees. In this tree structure, constraints are imposed on some of the parameters of the parametrized conditional distributions, such that all condition...
EBMT system of kyoto university in OLYMPICS task at IWSLT 2012
This paper describes the EBMT system of Kyoto University that participated in the OLYMPICS task at IWSLT 2012. When translating very different language pairs such as Chinese-English, it is very important to handle sentences as tree structures to overcome the differences. Many recent studies incorporate tree structures in some parts of the translation process, but not all the way from model training ...
Publication date: 2011